Structure determination of disordered materials from diffraction data 
We show that the information gained in spectroscopic experiments regarding the number and 
distribution of atomic environments can be used as a valuable constraint in the refinement of the 
atomic-scale structures of nanostructured or amorphous materials from pair distribution function 
(PDF) data. We illustrate the effectiveness of this approach for three paradigmatic disordered 
systems: molecular C60, a-Si, and a-SiO2. Much improved atomistic models are attained in each 
case without any a-priori assumptions regarding coordination number or local geometry. We propose 
that this approach may form the basis for a generalised methodology for structure "solution" from 
PDF data applicable to network, nanostructured and molecular systems alike. 

Many materials of fundamental importance possess structures that do not exhibit long-range periodicity: examples include metallic and covalent glasses [1], amorphous biominerals [2], the so-called "phase-change" chalcogenides of DVD-RAM technology [3], and amorphous semiconductors such as a-Si and a-Ge [1]. The absence of Bragg reflections in the diffraction patterns of these materials precludes the use of traditional crystallographic techniques as a means of determining their atomic-scale structures. Yet it is clear that these materials do possess well-defined local structure on the nanometre scale; moreover, it is often this local structure that is implicated in the particular physical properties of interest [B]. For this reason, the development of systematic information-based methodologies for the determination of local structure in disordered materials remains one of the key challenges in modern structural science [7].

Historically, local structure has been studied experimentally using two principal approaches: (i) the diffraction techniques of neutron and x-ray total scattering, from which the distribution of interatomic separations can be measured via the pair distribution function (PDF) [5], and (ii) resonance and spectroscopic methods (NMR, EXAFS, IR, Raman) that yield information concerning the number and population of distinct atomic environments, together with (in favourable cases) metal-coordination/molecular geometries [10]. These techniques afford a rich body of information, and over the past 5-10 years a number of sophisticated methods of analysis have emerged that aim to derive structural models via fitting to these experimental data. The Reverse Monte Carlo (RMC) [11] and Empirical Potential Structure Refinement (EPSR) [12] methods have been used widely in the glass and amorphous materials community, while the PDFfit [13] and "Liga" [14] methods have been applied more recently to nanostructured solids such as C60 [7] and ferrihydrite [15], systems that present similar crystallographic challenges.

There is, however, a fundamental problem: markedly 
different structural models can be equally consistent with 
the same PDF data [16]. Moreover, the task of fitting simultaneously to PDF and spectroscopic data is almost
always either too computationally demanding or in fact 
not quantitatively possible. Taken together, these fac- 
tors have meant that it is often difficult to determine the 
atomic-level structure of these materials, and that there 
is no "routine" information-based approach analogous to 
those for crystalline materials. 

In this Letter, we show that this problem can 
largely be solved by using information gained via 
spectroscopy — the number and population of distinct 
atomic environments — to guide refinement of experimen- 
tal PDF data. Structural refinement based on repro- 
ducing the experimental PDF alone is, in general, not 
sufficiently well-constrained to produce models that reflect the "true" local structure in a material; however,
if refinement is forced also to reflect the correct number 
and distribution of atom environments then convergence 
on the correct local structure usually follows. This ap- 
proach is easily implemented and generic. Moreover, we 
show that successful refinement can be initiated using 
entirely random atomistic models and, in being driven 
wholly by experimental data, one avoids any other a- 
priori assumptions concerning e.g. coordination numbers 
or geometries. While our focus lies on proof-of-principle 
at this stage, our results show that routine information- 
based structure determination of disordered materials is 
now a viable prospect. 

Our paper is arranged as follows. We begin by describing the particular implementation of our methodology through a "variance"-based term in the cost function used to drive PDF refinement. Three principal case studies follow: nanoparticulate C60 (single cluster; one atom environment), amorphous silicon (continuous network solid; one atom environment) and amorphous silica (continuous network solid; two atom environments). In all three instances we show that a conventional RMC approach fails to obtain the correct structure solution, often spectacularly, but that inclusion of the variance term is sufficient to recover almost-perfect models of material structure in each case. We conclude by discussing a number of different possible implementations of our underlying methodology.

In outlining our methodology, it is useful to consider 
first the simplest type of disordered material: namely a 
phase that contains a single atom type and for which 
spectroscopy indicates a single atom environment. The 
existence of a single atomic environment demands that 
structural correlation functions calculated for different 
individual atoms within the material should take similar 
forms. In order to recast this statement with specific reference to the PDF, we first define atomic PDFs $p_j(r)$ for an atomistic model such that the "bulk" (measurable) PDF $G(r)$ corresponds to the average $\langle p(r)\rangle$ taken over all atoms $j$. Then the existence of a single atom environment dictates a similarity $p_j(r) \simeq p_{j'}(r) \simeq G(r)$ for
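all atoms $j$, $j'$. Whereas a standard PDF-based structure refinement would involve minimising a function of the form

$$\chi^2 = \sum_r \left[ G_{\mathrm{expt}}(r) - \langle p(r)\rangle \right]^2, \qquad (1)$$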

FIG. 1: RMC refinement of the experimental PDF of C60: (a) the neutron PDF itself, corrected to remove inter-molecular correlations as described in Ref. [14]; (b) a random starting configuration of 60 carbon atoms; (c) a typical configuration produced by conventional RMC refinement of either idealised or experimental PDF data; and those produced by INVERT+RMC using (d) idealised, and (e) experimental PDF data. In panels (b)-(e), atoms with three nearest neighbours are coloured blue and others are coloured red.

what we would propose is the alternative cost function

$$\chi^2 = \sum_r \left\langle \left[ G_{\mathrm{expt}}(r) - p_j(r) \right]^2 \right\rangle. \qquad (2)$$

Note that in this reformulation one obtains $\chi^2 = 0$ if and only if the model PDF matches $G_{\mathrm{expt}}(r)$ and each individual $p_j(r)$ has the same form. It is straightforward to show that the new penalty function of Eq. (2) is in fact equal to that of Eq. (1) plus a variance term $\chi^2_{\mathrm{var}} = \langle p(r)^2\rangle - \langle p(r)\rangle^2$. What the spectroscopic result suggests is to add to a conventional PDF refinement a term that penalises variance in local coordination environments; for this reason we term our approach an INVariant Environment Refinement Technique (INVERT).
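
The decomposition into a conventional term plus a variance term can be checked in one line. The sketch below assumes that both cost functions are unweighted sums over $r$ and that $\langle\cdot\rangle$ denotes an average over atoms $j$; the exact normalisation is an assumption made for illustration.

```latex
\begin{align*}
\sum_r \big\langle [G_{\mathrm{expt}}(r) - p_j(r)]^2 \big\rangle
  &= \sum_r \big[ G_{\mathrm{expt}}(r)^2
     - 2\,G_{\mathrm{expt}}(r)\,\langle p(r)\rangle
     + \langle p(r)^2\rangle \big] \\
  &= \sum_r \big[ G_{\mathrm{expt}}(r) - \langle p(r)\rangle \big]^2
   + \sum_r \big[ \langle p(r)^2\rangle - \langle p(r)\rangle^2 \big]
\end{align*}
```

Both sums on the last line are non-negative, so the total vanishes only when the averaged PDF matches the data and the atom-to-atom variance is zero at every $r$.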

In practice, the individual $p_j(r)$ for a static atomistic model consist of a series of delta functions, and in order to obtain a well-behaved variance term it is necessary to adopt a modified formulation such as

$$\chi^2_{\mathrm{var}} = \frac{1}{N} \sum_{j,i} \frac{\left[ d_j(i) - \langle d(i)\rangle \right]^2}{\langle d(i)\rangle^2}, \qquad (3)$$

where $d_j(i)$ measures the distance from atom $j$ to its $i$-th neighbour, and $\langle d(i)\rangle$ is the average such distance over all atoms $j$. The term in the denominator of Eq. (3) appears in order to account for the fact that the number of neighbours at a distance $d$ scales with $d^2$ [17]. The extension to multiple atom types and/or atom environments is straightforward. A separate variance term is included for each different pair of atom types (A and B, say); the form of each individual term is the same as in Eq. (3), except that the $d_j(i)$ will refer to the $i$-th neighbour of type B around the $j$-th atom of type A, and so on.
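
A neighbour-distance variance of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: it treats a single atom type in a non-periodic cluster, normalises each term by the squared mean distance $\langle d(i)\rangle^2$, and the function name `invert_variance` and the parameter `n_neighbours` are hypothetical.

```python
import numpy as np

def invert_variance(positions, n_neighbours):
    """Variance of sorted neighbour distances over atoms (assumed form).

    positions    : (N, 3) array-like of Cartesian coordinates
    n_neighbours : how many nearest neighbours i = 1..n to include
    """
    pos = np.asarray(positions, dtype=float)
    n_atoms = len(pos)
    # Full matrix of pairwise distances (no periodic boundaries).
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Sort each row; column 0 is the zero self-distance, so skip it.
    d = np.sort(dist, axis=1)[:, 1:n_neighbours + 1]
    # Mean distance to the i-th neighbour, averaged over all atoms j.
    d_mean = d.mean(axis=0)
    # Squared relative deviations, summed over i and j, divided by N.
    return float(np.sum(((d - d_mean) / d_mean) ** 2) / n_atoms)

# Four equivalent atoms on a unit square: every atom sees the same
# sorted distances (1, 1, sqrt(2)), so the variance vanishes.
square = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
# Displacing one atom makes the environments inequivalent.
distorted = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0.3, 1.2, 0]]

chi2_square = invert_variance(square, 3)
chi2_distorted = invert_variance(distorted, 3)
```

The term is zero for the square and positive for the distorted cluster, although nothing about the target geometry was supplied: only the equivalence of the atoms is being tested.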

At this point we emphasise that no assumption has 
been made regarding the actual distribution of neigh- 
bours around each atom — only that this distribution 
should be similar for equivalent atoms. Moreover, we are 
able to constrain the partial PDF functions for multi- 
component systems despite the experimental PDF data 
representing a sum over these separate contributions. 

We have chosen C60 as a simple first case study, not least because the task of determining its well-known icosahedral structure from the experimental PDF [Fig. 1(a)] has recently been highlighted as a benchmark challenge in nanostructure determination [T^]. As straightforward as the task might seem, conventional RMC refinement from a random starting configuration [Fig. 1(b)] fails entirely, giving a set of small clusters that contains only a few of the real set of interatomic separations [Fig. 1(c)]. The same result is obtained even if idealised PDF data are used.

The INVERT modification exploits the experimental NMR result that C60 contains a single C environment [18]. Clearly the RMC configuration in Fig. 1(c) violates this property, and so would now give rise to a large $\chi^2_{\mathrm{var}}$ term that will help drive refinement forward. Indeed, INVERT+RMC refinement from the same random starting configuration gives the correct solution for idealised data [Fig. 1(d)] and a near-perfect solution for the experimental neutron PDF data of Ref. [14] [Fig. 1(e)]. We note that such a result has only ever been achieved previously using the highly-sophisticated cluster optimisation methods of the "Liga" algorithm or using genetic algorithms based on the principle of mating or crossover (the latter only giving correct solutions in 56% of attempts) [ZmSlEQ]. Here, INVERT+RMC consistently obtains a topologically-identical solution from random starting coordinates in approximately 2 000-4 000 accepted moves.
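
The way a variance term enters an RMC-type refinement can be sketched as an accept/reject step on a combined cost. This is a minimal illustration and not the implementation used here: the relative weighting of the two terms (`weight`), the function names, and the use of the conventional RMC acceptance probability $\exp(-\Delta\chi^2/2)$ are all assumptions, since none of them is specified in the text.

```python
import math
import random

def total_cost(chi2_pdf, chi2_var, weight=1.0):
    """Combined cost: PDF misfit plus a weighted variance penalty.

    `weight` is a hypothetical tuning parameter.
    """
    return chi2_pdf + weight * chi2_var

def accept_move(cost_old, cost_new, rng=random):
    """Metropolis-style test as used in conventional RMC.

    Moves that lower the cost are always accepted; moves that raise it
    are accepted with probability exp(-(cost_new - cost_old) / 2).
    """
    if cost_new <= cost_old:
        return True
    return rng.random() < math.exp(-(cost_new - cost_old) / 2.0)

# A trial move that leaves the PDF fit unchanged but makes the atomic
# environments more alike lowers the combined cost and is accepted.
before = total_cost(chi2_pdf=1.0, chi2_var=4.0)
after = total_cost(chi2_pdf=1.0, chi2_var=2.5)
accepted = accept_move(before, after)
```

In this picture a move that is neutral with respect to the PDF fit is no longer neutral overall, which is how the variance term can steer refinement away from configurations such as that of Fig. 1(c).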

We find the extension to a cluster with two atom environments, namely S12 [H], enjoys similar success [22]. Videos that illustrate the refinement process for C60 and S12 are provided as supporting information [23].

The paradigmatic "stumbling block" for RMC, however, has always been amorphous Si, whose structure is believed to consist of a continuous random network (CRN) of tetrahedral Si centres [21]. Rather than generating a network of four-fold-coordinated Si atoms, RMC refinements of a-Si PDF data yield configurations with unphysically-broad distributions of Si coordination numbers [16]. This is allowed because, crudely speaking, a pair of atoms of coordination numbers three and five will contribute to the average PDF indistinguishably from two four-fold coordinated atoms, and yet the former state is statistically more likely during a sequence of random moves. Various work-arounds have been proposed and implemented (e.g. constraining coordination numbers to equal four), and in the most favourable of cases these yield CRNs comparable to those obtained from bond-switching (Wooten-Weaire-Winer, "WWW" [25]) methods, molecular dynamics and ab-initio calculations [24].
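
The ambiguity can be made concrete with a two-atom toy example. The numbers below are illustrative and are not taken from any refinement: they compare the mean and the population variance of the coordination numbers for a (4, 4) pair and a (3, 5) pair.

```python
import statistics

# Coordination numbers of two atoms in each hypothetical configuration.
ideal = [4, 4]       # two four-fold coordinated atoms
defective = [3, 5]   # one under- and one over-coordinated atom

# An averaged quantity (what the bulk PDF is sensitive to) is identical.
mean_ideal = statistics.mean(ideal)
mean_defective = statistics.mean(defective)

# The atom-to-atom variance is what tells the two cases apart.
var_ideal = statistics.pvariance(ideal)
var_defective = statistics.pvariance(defective)
```

Both pairs have a mean coordination of four, so an average over atoms cannot separate them, whereas the variance is zero for the first pair and one for the second.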

However, there is a sense in which one recovers from 
these approaches only the very information already used 
to generate the constraints: if the coordination number 
is constrained to equal four during refinement, then four- 
fold coordination cannot be considered an independent 
result. Consequently, our motivation for considering a- 
Si as a second case study was primarily to determine 
whether, by using the evidence for a single Si environment from NMR studies [26], INVERT+RMC refinement
could yield reasonable structural models without recourse 
to explicit coordination number constraints. 

First, a conventional RMC refinement was performed using $G(r)$ "data" generated from the trusted WWW model of Ref. [25]. The starting configuration was a random collection of 512 Si atoms in a cubic box of side 21.7 Å. Refinement gave a highly-disordered configuration that displayed all the hallmarks of previously-described problematic RMC studies [16]: only 27% of

FIG. 2: Comparison of a-Si configurations obtained using (left) "Native RMC" and (centre) "INVERT+RMC" refinement with (right) the trusted "WWW" model of Ref. [25]. (a) Slices of the configurations themselves, with four-coordinate Si coloured blue and others coloured red. (b) The PDFs calculated from each configuration, which are essentially identical. (c) Corresponding PDF variances calculated using Eq. (3).

Si atoms are four-fold coordinated, there are substantial density variations, and large numbers of unphysical Si3 "triangles" [left-hand panel of Fig. 2(a)]. In contrast, a parallel INVERT+RMC refinement achieved an almost perfect coordination distribution (95% four-fold). The improvement extended even to the higher-order correlations (discussed in more detail below): in particular, the number of Si3 triangles is halved, and the density distribution is much more even. Inspection of the configuration itself [centre panel of Fig. 2(a)] now reveals obvious similarities to the trusted WWW model [right-hand panel of Fig. 2(a)]. The PDF itself is relatively insensitive to this fundamental improvement in local structure modelling [Fig. 2(b)], while the variance term of Eq. (3) clearly acts as a much better figure-of-merit [Fig. 2(c)].
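
A coordination statistic such as the percentage of four-fold Si atoms can be computed from a periodic configuration as sketched below. This is an illustrative sketch only: the bond cutoff is an assumption (no value is given in the text), the function names are hypothetical, and the usage example is a crystalline diamond-cubic cell rather than any of the refined models.

```python
import numpy as np

def coordination_numbers(positions, box_length, cutoff):
    """Count neighbours within `cutoff` in a periodic cubic box.

    Uses the minimum-image convention, which is only valid when
    cutoff <= box_length / 2.
    """
    pos = np.asarray(positions, dtype=float)
    diff = pos[:, None, :] - pos[None, :, :]
    # Wrap each separation vector onto its nearest periodic image.
    diff -= box_length * np.round(diff / box_length)
    dist = np.linalg.norm(diff, axis=-1)
    # Exclude each atom from its own neighbour count.
    np.fill_diagonal(dist, np.inf)
    return (dist < cutoff).sum(axis=1)

def fraction_with_coordination(positions, box_length, cutoff, target=4):
    """Fraction of atoms whose coordination number equals `target`."""
    cn = coordination_numbers(positions, box_length, cutoff)
    return float(np.mean(cn == target))

# Usage: one conventional diamond-cubic cell of crystalline Si
# (8 atoms, a = 5.43 angstrom, nearest-neighbour distance ~2.35 angstrom).
a = 5.43
frac = np.array([
    [0.00, 0.00, 0.00], [0.00, 0.50, 0.50],
    [0.50, 0.00, 0.50], [0.50, 0.50, 0.00],
    [0.25, 0.25, 0.25], [0.25, 0.75, 0.75],
    [0.75, 0.25, 0.75], [0.75, 0.75, 0.25],
])
# Hypothetical cutoff between the first (2.35) and second (3.84) shells.
perfect = fraction_with_coordination(frac * a, a, cutoff=2.7)
```

For the crystalline cell every atom is four-fold coordinated, so the returned fraction is one; applied to a refined configuration, the same function gives the kind of percentage quoted above, with a value that depends on the chosen cutoff.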

Similar results are obtained for amorphous SiO2, which is a conceptual extension in that it contains two distinct atom environments: that of the Si atoms and that of the O atoms. Experimental neutron PDF data were taken from Ref. [27], and starting configurations generated from a random distribution of 64 Si atoms and 128 O atoms

FIG. 3: A slice of the information-driven CRN model of a-SiO2 obtained using INVERT+RMC refinement as described in the text. Si and O atoms are shown in dark and light colours, respectively; atoms in shades of blue have the expected coordination numbers of 4 (Si) and 2 (O), while the few in shades of red have incorrect coordination numbers.

in a periodic cubic box of side 14.37 Å. RMC refinement both with and without the INVERT modification gave excellent fits to the PDF, but the INVERT+RMC model had a much higher percentage of fourfold Si coordination (97% vs. 59% for the RMC-only configuration). Indeed, we believe the INVERT+RMC configuration to be the first information-based CRN model of a-SiO2 [Fig. 3].

The INVERT methodology is by no means applicable 
only to RMC refinement. Our focus on RMC in this 
Letter arises from a desire to demonstrate the effective- 
ness of the INVERT approach using a refinement method 
that is known to favour disorder. The incorporation of 
variance-based cost functions in any refinement approach 
is straightforward, and such a modification to more so- 
phisticated PDF fitting approaches than RMC, e.g. as 
suggested in Ref. might reasonably be expected to 
produce even more realistic configurations. 

Speaking more generally, we would note that the con- 
cept of local invariance encompasses more than minimis- 
ing the PDF variance alone. One can imagine, for exam- 
ple, that minimising the variance in higher-order correla- 
tion functions, such as angle distributions, coordination 
geometry, and CRN ring statistics might also improve re- 
finement further. Importantly, these constraints can be 
implemented despite the functions not being measurable 
experimentally. In practice, however, we have found that 
the calculation of higher-order correlation functions is too 
computationally-demanding for speedy refinement at this 
stage; the extension to constraining geometric invariance 
using spherical harmonics and/or the triplet distribution 
function is an approach we hope to pursue further in the 
near future. 